Machine Learning for Robotic Manipulation in cluttered environments
In this thesis we focus on designing the planner for MIT's entry in the Amazon Picking Challenge, a robotic competition that aims to push the frontiers of manipulation until robots can substitute for human pickers in warehouses. Given a set of manipulation primitives (such as grasping, suction, scooping, placing, or pushing), we designed a system capable of learning a planner from a set of manipulation experiments. After learning, given any configuration of objects, the planner produces the optimal sequence of primitives, applied to any object in the scene, that maximizes the probability of successfully picking the goal object. In this research we analyzed Reinforcement Learning, Deep Learning, and Planning approaches. For each one, we first describe the background theory and characterize it for our application to robotics. Then we describe a prototype built in that area and the lessons learned from it. Finally, we combine the strengths of all three areas to create the final design of our system.
Graph Element Networks: adaptive, structured computation and memory
We explore the use of graph neural networks (GNNs) to model spatial processes
in which there is no a priori graphical structure. Similar to finite element
analysis, we assign nodes of a GNN to spatial locations and use a computational
process defined on the graph to model the relationship between an initial
function defined over a space and a resulting function in the same space. We
use GNNs as a computational substrate, and show that the locations of the nodes
in space as well as their connectivity can be optimized to focus on the most
complex parts of the space. Moreover, this representational strategy allows the
learned input-output relationship to generalize over the size of the underlying
space and run the same model at different levels of precision, trading
computation for accuracy. We demonstrate this method on a traditional PDE
problem, a physical prediction problem from robotics, and learning to predict
scene images from novel viewpoints.
Comment: Accepted to ICML 201
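The idea of placing graph nodes at spatial locations and running a computation over the graph can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the radius-based connectivity rule, the fixed neighbour-averaging update (which stands in for the learned message functions of a GNN), and the inverse-distance read-out are not taken from the paper.

```python
import math

def build_edges(positions, radius):
    """Connect every ordered pair of distinct nodes within `radius` (hypothetical rule)."""
    edges = []
    for i, p in enumerate(positions):
        for j, q in enumerate(positions):
            if i != j and math.dist(p, q) <= radius:
                edges.append((i, j))
    return edges

def message_passing_step(values, edges):
    """Replace each node value by the mean of itself and its in-neighbours.

    A learned GNN would use trained message and update functions here;
    plain averaging is a stand-in.
    """
    sums = list(values)
    counts = [1] * len(values)
    for src, dst in edges:
        sums[dst] += values[src]
        counts[dst] += 1
    return [s / c for s, c in zip(sums, counts)]

def read_out(query, positions, values):
    """Inverse-distance-weighted interpolation of node values at `query`."""
    weights = [1.0 / (math.dist(query, p) + 1e-9) for p in positions]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total

# Four nodes on the corners of the unit square; the initial function
# is 1 at one corner and 0 elsewhere.
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 0.0, 0.0, 0.0]
edges = build_edges(positions, radius=1.0)

for _ in range(3):
    values = message_passing_step(values, edges)

# Query the resulting function at a location that is not a node.
centre = read_out((0.5, 0.5), positions, values)
```

In this toy setup, moving the entries of `positions` or changing `radius` changes where the graph spends its computation, which is the kind of degree of freedom the abstract describes as optimizable.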
Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching
This paper presents a robotic pick-and-place system that is capable of
grasping and recognizing both known and novel objects in cluttered
environments. The key new feature of the system is that it handles a wide range
of object categories without needing any task-specific training data for novel
objects. To achieve this, it first uses a category-agnostic affordance
prediction algorithm to select and execute among four different grasping
primitive behaviors. It then recognizes picked objects with a cross-domain
image classification framework that matches observed images to product images.
Since product images are readily available for a wide range of objects (e.g.,
from the web), the system works out-of-the-box for novel objects without
requiring any additional training data. Exhaustive experimental results
demonstrate that our multi-affordance grasping achieves high success rates for
a wide variety of objects in clutter, and our recognition algorithm achieves
high accuracy for both known and novel grasped objects. The approach was part
of the MIT-Princeton Team system that took 1st place in the stowing task at the
2017 Amazon Robotics Challenge. All code, datasets, and pre-trained models are
available online at http://arc.cs.princeton.edu
Comment: Project webpage: http://arc.cs.princeton.edu Summary video: https://youtu.be/6fG7zwGfIk
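The two stages described in this abstract, choosing among grasping primitives by predicted affordance and then recognizing the picked object by matching against product images, can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the primitive names, the per-pixel score grids, the two-dimensional feature vectors, and the Euclidean nearest-neighbour matching rule are all hypothetical placeholders rather than the system's actual components.

```python
import math

def select_action(affordance_maps):
    """Return (primitive, row, col, score) for the highest predicted score.

    `affordance_maps` maps a primitive name to a 2D grid of scores,
    one score per image location (hypothetical data layout).
    """
    best = None
    for primitive, grid in affordance_maps.items():
        for r, row in enumerate(grid):
            for c, score in enumerate(row):
                if best is None or score > best[3]:
                    best = (primitive, r, c, score)
    return best

def recognize(observed_feature, product_features):
    """Return the product whose feature vector is closest to the observed one.

    Nearest-neighbour matching by Euclidean distance is an assumption here.
    """
    return min(
        product_features,
        key=lambda name: math.dist(observed_feature, product_features[name]),
    )

# Hypothetical affordance scores for four primitives on a 2x2 image.
affordance_maps = {
    "primitive_0": [[0.1, 0.2], [0.3, 0.4]],
    "primitive_1": [[0.9, 0.1], [0.2, 0.3]],
    "primitive_2": [[0.5, 0.5], [0.5, 0.5]],
    "primitive_3": [[0.0, 0.0], [0.0, 0.8]],
}
action = select_action(affordance_maps)

# Hypothetical feature vectors for product images and for the observed image.
product_features = {"mug": (1.0, 0.0), "sponge": (0.0, 1.0)}
label = recognize((0.9, 0.2), product_features)
```

Because recognition only compares against stored product features, adding a novel object in this sketch means adding one entry to `product_features`, with no retraining step.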
GraphCast: Learning skillful medium-range global weather forecasting
We introduce a machine-learning (ML)-based weather simulator--called
"GraphCast"--which outperforms the most accurate deterministic operational
medium-range weather forecasting system in the world, as well as all previous
ML baselines. GraphCast is an autoregressive model, based on graph neural
networks and a novel high-resolution multi-scale mesh representation, which we
trained on historical weather data from the European Centre for Medium-Range
Weather Forecasts (ECMWF)'s ERA5 reanalysis archive. It can make 10-day
forecasts, at 6-hour time intervals, of five surface variables and six
atmospheric variables, each at 37 vertical pressure levels, on a 0.25-degree
latitude-longitude grid, which corresponds to roughly 25 x 25 kilometer
resolution at the equator. Our results show GraphCast is more accurate than
ECMWF's deterministic operational forecasting system, HRES, on 90.0% of the
2760 variable and lead time combinations we evaluated. GraphCast also
outperforms the most accurate previous ML-based weather forecasting model on
99.2% of the 252 targets it reported. GraphCast can generate a 10-day forecast
(35 gigabytes of data) in under 60 seconds on Cloud TPU v4 hardware. Unlike
traditional forecasting methods, ML-based forecasting scales well with data: by
training on bigger, higher quality, and more recent data, the skill of the
forecasts can improve. Together these results represent a key step forward in
complementing and improving weather modeling with ML, open new opportunities
for fast, accurate forecasting, and help realize the promise of ML-based
simulation in the physical sciences.
Comment: Main text: 21 pages, 8 figures, 1 table. Appendix: 15 pages, 5 figures, 2 tables
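The figures quoted in this abstract can be cross-checked with some back-of-the-envelope arithmetic, alongside a toy version of an autoregressive rollout. Two values below are assumptions not stated in the abstract: that the 0.25-degree grid includes both poles (721 latitude rows by 1440 longitude columns) and that each value is stored as a 4-byte float. The `rollout` helper and its toy step function are hypothetical and only show how each prediction is fed back as the next input.

```python
FORECAST_DAYS = 10
STEP_HOURS = 6
SURFACE_VARS = 5
ATMOS_VARS = 6
PRESSURE_LEVELS = 37
GRID_DEG = 0.25
BYTES_PER_VALUE = 4  # assumption: 32-bit floats

# Number of autoregressive steps needed to cover the forecast horizon.
steps = FORECAST_DAYS * 24 // STEP_HOURS

# Values predicted per grid point: surface variables plus
# atmospheric variables at every pressure level.
channels = SURFACE_VARS + ATMOS_VARS * PRESSURE_LEVELS

# Assumption: latitude rows include both poles, longitude wraps around.
n_lat = int(180 / GRID_DEG) + 1
n_lon = int(360 / GRID_DEG)
grid_points = n_lat * n_lon

forecast_bytes = steps * channels * grid_points * BYTES_PER_VALUE
forecast_gib = forecast_bytes / 2**30

def rollout(step_fn, state, n_steps):
    """Apply `step_fn` repeatedly, feeding each output back in as the next input."""
    states = []
    for _ in range(n_steps):
        state = step_fn(state)
        states.append(state)
    return states

# Toy stand-in for a learned one-step model: halve the state at each step.
toy_states = rollout(lambda x: 0.5 * x, 1.0, steps)

print(f"{steps} steps, {channels} values per grid point, "
      f"{grid_points} grid points, {forecast_gib:.1f} GiB")
```

Under these assumptions the forecast takes 40 steps with 227 values per grid point, and the total size comes out close to the 35 gigabytes quoted above.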